19 research outputs found

    Intérprete y entorno de desarrollo para el aprendizaje de lenguajes de programación estructurada (Interpreter and development environment for learning structured programming languages)

    The main objective of this project is the design, development and implementation of an interpreter for a programming language that can be used in introductory computing courses. The work shows how interpreters can be built, a practice with little tradition in our country, unlike in more developed countries. It also presents an integrated development environment intended to ease the introduction to programming, offering a friendly environment and a programming language based entirely on Spanish. In the author's opinion, this second feature will help students better understand the language and computing processes. Chapter 1 describes the problem of choosing a suitable language for teaching introductory programming courses, the options currently available, and a possible solution to this problem. Chapter 2 formulates a proposal that addresses the problem posed in Chapter 1 and defines the language, how it works, and the environment in which it is to run. Chapter 3 presents the implementation of the proposed interpreter and environment. Chapter 4 sets out the observations, conclusions, recommendations and future work for both the interpreter and the environment. Thesis

    Solving the Structural Modeling Problems for Tandem Repeat Proteins

    Over the last decade, numerous studies have demonstrated the fundamental importance of tandem repeat proteins (TRPs) in many biological processes (Andrade, Perez-Iratxeta, and Ponting 2001). Repeat proteins are a widespread class of non-globular proteins that carry heterogeneous functions and are involved in several diseases. One of the most frequent problems in biology is the functional characterization of a protein. This problem is usually solved by analyzing the three-dimensional (3D) structure. The experimental determination of the 3D structure is time-consuming and technically difficult. For this reason, structure prediction by homology modeling offers a fast alternative to experimental approaches. However, homology modeling is not feasible for tandem repeat proteins because it is difficult to infer homology due to a high degree of sequence degeneration. In this thesis, I focused on algorithms oriented toward repeat unit prediction and characterization. I developed an innovative approach, Repeat Protein Unit Predictor (ReUPred), for fast automatic prediction of repeat units and repeat classification, exploiting a Structure Repeat Unit Library (SRUL) derived from RepeatsDB, the core database of TRPs. ReUPred is based on the Victor C++ library, an open source platform dedicated to protein structure manipulation. To assess the accuracy of the predictor, we ran it against all the entries in the PDB database, and the resulting predictions allowed us to improve RepeatsDB annotation and increase it twentyfold. During my PhD I integrated ReUPred predictions into the new version of RepeatsDB (release 2.0), which now features information on start and end positions for the repeat regions and units for all entries. The updated web interface includes a new search engine for complex queries and a fully redesigned entry page for a better overview of structural data.
To further improve RepeatsDB quality, we decided to provide a finer classification at the subclass level based on the structural conformation of the repeated units. We hypothesized that inside these ensembles it is possible to find subgroups of proteins sharing the same unit type. To test this, we performed a detailed structural analysis. We created a network where nodes are the units and arcs represent structural similarity. The network can be partitioned into 7 different clusters. For each cluster, it was possible to create a Hidden Markov Model similar to those representing Pfam domains. This analysis is unpublished work, but it has already helped to improve ReUPred accuracy and RepeatsDB annotation. To summarize, this work is a partial answer to the problems of TRP modeling and might be helpful in future investigations such as drug design and disease studies.

    Revenant: A database of resurrected proteins

    Revenant is a database of resurrected proteins coming from extinct organisms. Currently, it contains a manually curated collection of 84 resurrected proteins derived from bibliographic data. Each protein is extensively annotated, including structural, biochemical and biophysical information. Revenant offers a browsing interface designed as a timeline from which the different proteins can be accessed. The oldest Revenant entries date from between 4200 and 3500 million years ago, while the youngest date from between 8.8 and 6.3 million years ago. These proteins have been resurrected using computational tools called ancestral sequence reconstruction techniques, combined with wet-laboratory synthesis and expression. Resurrected proteins have been used increasingly in recent years to explore and test different evolutionary hypotheses, such as those concerning protein stability, to explore the origin of new functions, to gain biochemical insights into past metabolisms, and to explore the specificity and promiscuous behaviour of ancient proteins.

    Affiliation: Carletti, Matías Sebastian. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Universidad Nacional de Quilmes. Departamento de Ciencia y Tecnología; Argentina
    Affiliation: Monzon, Alexander. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Universidad Nacional de Quilmes. Departamento de Ciencia y Tecnología; Argentina. Università di Padova; Italy
    Affiliation: Garcia Rios, Emilio. Pontificia Universidad Católica de Perú; Peru
    Affiliation: Benítez, Guillermo Ignacio. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Universidad Nacional de Quilmes. Departamento de Ciencia y Tecnología; Argentina
    Affiliation: Hirsh, Layla. Pontificia Universidad Católica de Perú; Peru
    Affiliation: Fornasari, Maria Silvina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Universidad Nacional de Quilmes. Departamento de Ciencia y Tecnología; Argentina
    Affiliation: Parisi, Gustavo Daniel. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Universidad Nacional de Quilmes. Departamento de Ciencia y Tecnología; Argentina

    CoDNaS-RNA: a database of Conformational Diversity in the Native State of RNA

    Conformational changes in RNA native ensembles are central to fulfilling many of their biological roles. Systematic knowledge of the extent and possible modulators of this conformational diversity is desirable to better understand the relationship between RNA dynamics and function. We have developed CoDNaS-RNA as the first database of conformational diversity in RNA molecules. Known RNA structures are retrieved and clustered to identify alternative conformers of each molecule. Pairwise structural comparisons within each cluster make it possible to measure the variability of the molecule. Additional data on structural features, molecular interactions and functional annotations are provided. CoDNaS-RNA is implemented as a public resource that can be of interest to computational and bench scientists alike.

    Affiliation: González Buitrón, Martín. Universidad Nacional de Quilmes. Departamento de Ciencia y Tecnología; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
    Affiliation: Romario Tunque Cahui, Ronaldo. Pontificia Universidad Católica de Perú; Peru
    Affiliation: García Ríos, Emilio. Pontificia Universidad Católica de Perú; Peru
    Affiliation: Hirsh, Layla. Pontificia Universidad Católica de Perú; Peru
    Affiliation: Fornasari, Maria Silvina. Universidad Nacional de Quilmes. Departamento de Ciencia y Tecnología; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
    Affiliation: Parisi, Gustavo Daniel. Universidad Nacional de Quilmes. Departamento de Ciencia y Tecnología; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
    Affiliation: Palopoli, Nicolás. Universidad Nacional de Quilmes. Departamento de Ciencia y Tecnología; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina

    RepeatsDB in 2021: Improved data and extended classification for protein tandem repeat structures

    The RepeatsDB database (URL: https://repeatsdb.org/) provides annotations and classification for protein tandem repeat structures from the Protein Data Bank (PDB). Protein tandem repeats are ubiquitous in all branches of the tree of life. The accumulation of solved repeat structures provides new possibilities for classification and detection, but also increases the need for annotation. Here we present RepeatsDB 3.0, which addresses these challenges and presents an extended classification scheme. The major conceptual change compared to the previous version is the hierarchical classification combining top levels based solely on structural similarity (Class > Topology > Fold) with two new levels (Clan > Family) requiring sequence similarity and describing repeat motifs in collaboration with Pfam. Data growth has been addressed with improved mechanisms for browsing the classification hierarchy. A new UniProt-centric view unifies the increasingly frequent annotation of structures from identical or similar sequences. This update of RepeatsDB aligns with our commitment to develop a resource that extracts, organizes and distributes specialized information on tandem repeat protein structures.

    Affiliation: Paladin, Lisanna. Università di Padova; Italy
    Affiliation: Bevilacqua, Martina. Università di Padova; Italy
    Affiliation: Errigo, Sara. Università di Padova; Italy
    Affiliation: Piovesan, Damiano. Università di Padova; Italy
    Affiliation: Mičetić, Ivan. Università di Padova; Italy
    Affiliation: Necci, Marco. Università di Padova; Italy
    Affiliation: Monzon, Alexander Miguel. Università di Padova; Italy
    Affiliation: Fabre, Maria Laura. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Instituto de Biotecnología y Biología Molecular. Universidad Nacional de La Plata. Facultad de Ciencias Exactas. Instituto de Biotecnología y Biología Molecular; Argentina. Universidad Nacional de La Plata. Facultad de Ciencias Exactas. Departamento de Ciencias Biológicas; Argentina
    Affiliation: López, José Luis. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Instituto de Biotecnología y Biología Molecular. Universidad Nacional de La Plata. Facultad de Ciencias Exactas. Instituto de Biotecnología y Biología Molecular; Argentina. Universidad Nacional de La Plata. Facultad de Ciencias Exactas. Departamento de Ciencias Biológicas; Argentina
    Affiliation: Nilsson, Juliet Fernanda. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Instituto de Biotecnología y Biología Molecular. Universidad Nacional de La Plata. Facultad de Ciencias Exactas. Instituto de Biotecnología y Biología Molecular; Argentina. Universidad Nacional de La Plata. Facultad de Ciencias Exactas. Departamento de Ciencias Biológicas; Argentina
    Affiliation: Ríos, Javier Sebastián. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Universidad Nacional de Quilmes. Departamento de Ciencia y Tecnología; Argentina
    Affiliation: Lorenzano Menna, Pablo. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Universidad Nacional de Quilmes. Departamento de Ciencia y Tecnología; Argentina
    Affiliation: Cabrera, Maia Diana Eliana. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Universidad Nacional de Quilmes. Departamento de Ciencia y Tecnología; Argentina
    Affiliation: González Buitrón, Martín. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Universidad Nacional de Quilmes. Departamento de Ciencia y Tecnología; Argentina
    Affiliation: Gonçalves Kulik, Mariane. Johannes Gutenberg Universitat Mainz; Germany
    Affiliation: Fernández Alberti, Sebastián. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Universidad Nacional de Quilmes. Departamento de Ciencia y Tecnología; Argentina
    Affiliation: Fornasari, Maria Silvina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Universidad Nacional de Quilmes. Departamento de Ciencia y Tecnología; Argentina
    Affiliation: Parisi, Gustavo Daniel. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Universidad Nacional de Quilmes. Departamento de Ciencia y Tecnología; Argentina
    Affiliation: Lagares, Antonio. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Instituto de Biotecnología y Biología Molecular. Universidad Nacional de La Plata. Facultad de Ciencias Exactas. Instituto de Biotecnología y Biología Molecular; Argentina. Universidad Nacional de La Plata. Facultad de Ciencias Agrarias y Forestales. Departamento de Ciencias Biológicas; Argentina
    Affiliation: Hirsh, Layla. Pontificia Universidad Católica de Perú; Peru
    Affiliation: Andrade Navarro, Miguel A. Johannes Gutenberg Universitat Mainz; Germany
    Affiliation: Kajava, Andrey V. Centre National de la Recherche Scientifique; France
    Affiliation: Tosatto, Silvio C E. Università di Padova; Italy

